- Path: news-server.ncren.net!concert!ais!bruce
- From: bruce@ais.com
- Newsgroups: comp.dcom.modems
- Subject: Datacomm compression vs V.42bis compression (was Re: How come Z-Modem)
- Message-ID: <1996Mar26.100502.8753@ais>
- Date: 26 Mar 96 10:05:02 EST
- References: <4inro0$9mj@alcor.usc.edu> <4j7mp0$l33@sam.inforamp.net>
- Organization: Applied Information Systems, Chapel Hill, NC
-
- In article <4j7mp0$l33@sam.inforamp.net>, crs0794@inforamp.net (Geoffrey Welsh) writes:
- >
- > Oh, and yes... _full_ ZMODEM implementations include data compression, but
- > with data compression built in to any decent modem these days, that's not
- > very important.
-
- On the subject of data compression inside a file transfer protocol,
- I have a question, and I wonder whether anyone has ever run the
- tests needed to come up with a definitive answer.
-
- Given a pair of modems that use a decent-sized V.42bis dictionary,
- using the common file transfer / data communications protocols (ZMODEM,
- Kermit, SLIP, PPP), can you achieve faster overall throughput by turning
- on compression in the protocol, or by turning it off and allowing the
- modem to do the compression by itself? Or does it matter much? Assume
- that DTE speed is much greater than connect speed so that you are not DTE
- speed limited. For the protocols that allow bidirectional traffic (SLIP
- and PPP), you should probably also limit the traffic to primarily one-way,
- with the return traffic being the minimum necessary to maintain the
- traffic flow. Also, the size of the V.42bis dictionary needed to be
- "decent" is likely to be another parameter in the equation, as is the
- type of data being sent by the protocol (some data being more compressible
- than others).
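-
- To make the comparison concrete at all, you need some way of timing the
- link itself.  Here's a rough sketch (untested; the name "pump" and the
- device name are just examples, and it ignores protocol framing and ACK
- traffic entirely) of the kind of harness I have in mind: push a file out
- an already-configured serial port, once raw and once after
- pre-compressing it, and see whether the times differ.
-
-     #include <stdio.h>
-     #include <fcntl.h>
-     #include <unistd.h>
-     #include <termios.h>
-     #include <sys/time.h>
-     #include <sys/types.h>
-
-     /* Crude throughput check.  Assumes the port (e.g. /dev/cua0) has
-      * already been set up -- speed, raw mode, RTS/CTS flow control --
-      * by your comm program or stty, and that the modems have a V.42
-      * connection with V.42bis enabled.
-      */
-     int main(int argc, char **argv)
-     {
-         unsigned char buf[1024];
-         struct timeval t0, t1;
-         double secs;
-         long total = 0;
-         ssize_t n;
-         int port, file;
-
-         if (argc != 3) {
-             fprintf(stderr, "usage: pump port file\n");
-             return 1;
-         }
-         if ((port = open(argv[1], O_WRONLY)) < 0 ||
-             (file = open(argv[2], O_RDONLY)) < 0) {
-             perror("open");
-             return 1;
-         }
-
-         gettimeofday(&t0, NULL);
-         while ((n = read(file, buf, sizeof buf)) > 0) {
-             ssize_t off = 0;
-             while (off < n) {                 /* write() may be partial */
-                 ssize_t w = write(port, buf + off, n - off);
-                 if (w < 0) { perror("write"); return 1; }
-                 off += w;
-             }
-             total += n;
-         }
-         tcdrain(port);                        /* wait for the output to drain */
-         gettimeofday(&t1, NULL);
-
-         secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
-         printf("%ld bytes in %.2f s = %.0f cps\n", total, secs, total / secs);
-         close(file);
-         close(port);
-         return 0;
-     }
-
- Comparing the raw time against the pre-compressed time (plus whatever the
- pre-compression itself costs) is at least a first approximation to the
- protocol question, even though it leaves the protocol overhead out.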
-
- The point of the question is that, unlike compression algorithms such as
- the LZ-family algorithms used by GIF and ZIP, most file transfer and data comm
- protocols do compression by fairly simple "repeated byte elimination",
- that is, if you have a series of repeated bytes the algorithm will send
- these as something like:
-
- <REPEAT-INDICATOR> <REPEATED-BYTE> <REPEAT-COUNT>
-
- (typically 3 bytes), rather than as, for example, the single token that
- represents a string of bytes used by more sophisticated algorithms. Even
- if your file is fairly compressible using repeated byte elimination, you
- could find that the compression algorithm tends to defeat the dictionary
- used by LZ and V.42bis if there are a number of different combinations of
- <REPEATED-BYTE> and <REPEAT-COUNT> such that they tend to fill up the
- dictionary and crowd out potentially more useful strings. (Of course
- this is likely to require somewhat pathological data :->). Obviously
- you can also run into problems if the data contains a lot of occurrences
- of the byte value used as the <REPEAT-INDICATOR>, so that you need to insert
- some kind of escape to change the normal meaning of that value, but let's
- assume that you're not sending random or pre-compressed data -- i.e., that
- the data is text or executable or "normal" data files rather than ZIP or
- GIF files. In addition, the types of compression achieved by repeated-
- byte elimination are likely to be similar in their effect to what can be
- achieved by V.42bis, so that the benefit of using both is not obvious.
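-
- To be concrete, the sort of repeated-byte elimination I'm talking about
- looks something like the sketch below.  The indicator byte (0x90) and the
- minimum run length are arbitrary choices for the example, not any
- particular protocol's, and literal occurrences of the indicator are
- always sent in the three-byte form so the decoder can't misread them.
-
-     #include <stdio.h>
-
-     #define RPT    0x90    /* repeat-indicator byte (arbitrary) */
-     #define MINRUN 4       /* shorter runs are cheaper to send literally */
-
-     /* A run of N identical bytes (N >= MINRUN) becomes
-      * <RPT> <byte> <count>; everything else is passed through
-      * untouched, except that RPT itself is always escaped into the
-      * three-byte form.  Returns the number of bytes written to out.
-      */
-     static long rle_encode(const unsigned char *in, long len,
-                            unsigned char *out)
-     {
-         long i = 0, o = 0;
-
-         while (i < len) {
-             long run = 1;
-             while (i + run < len && in[i + run] == in[i] && run < 255)
-                 run++;
-
-             if (run >= MINRUN || in[i] == RPT) {
-                 out[o++] = RPT;                  /* 3 bytes replace the run */
-                 out[o++] = in[i];
-                 out[o++] = (unsigned char)run;
-             } else {
-                 long j;
-                 for (j = 0; j < run; j++)        /* short run: send as-is */
-                     out[o++] = in[i];
-             }
-             i += run;
-         }
-         return o;
-     }
-
-     int main(void)
-     {
-         unsigned char in[256], out[768];
-         long n, i;
-
-         for (i = 0; i < 200; i++) in[i] = 0;             /* a long run */
-         for (i = 0; i < 26; i++) in[200 + i] = 'a' + i;  /* some text  */
-         n = rle_encode(in, 226, out);
-         printf("226 bytes in, %ld bytes out\n", n);
-         return 0;
-     }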
-
- My _guess_ (and that's all that it is at this point) is that, assuming
- that you are not DTE-limited in some way, most of the time you'll find
- that it makes little or no detectable difference whether the DTE stream
- is compressed or not if the compression uses one of these simple schemes
- (formats such as ZIP _could_ achieve higher compression if they use a
- large enough dictionary). But has anyone actually run any tests on this?
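-
- One way to get a partial answer without tying up a phone line would be to
- count how many codewords a fixed-dictionary LZW-style coder emits for a
- file before and after repeated-byte elimination.  This is only a rough
- stand-in for V.42bis (which is LZW-based but has its own control codes,
- transparent/compressed mode switching, and dictionary-recovery rules),
- and the 2048-entry dictionary below is just an assumed size, but if the
- two codeword counts come out nearly the same, the protocol-level
- compression isn't buying you anything once the modem compresses.
-
-     #include <stdio.h>
-
-     #define DICT_MAX 2048   /* assumed size; V.42bis negotiates its own */
-
-     struct entry { int prefix; unsigned char ch; };
-     static struct entry dict[DICT_MAX];
-     static int dict_size;
-
-     static void dict_init(void)
-     {
-         int i;
-         for (i = 0; i < 256; i++) {     /* codes 0..255 = single bytes */
-             dict[i].prefix = -1;
-             dict[i].ch = (unsigned char)i;
-         }
-         dict_size = 256;
-     }
-
-     static int dict_find(int prefix, unsigned char ch)
-     {
-         int i;                          /* linear search: slow but simple */
-         for (i = 0; i < dict_size; i++)
-             if (dict[i].prefix == prefix && dict[i].ch == ch)
-                 return i;
-         return -1;
-     }
-
-     /* Count the codewords a greedy LZW-style coder would emit. */
-     static long lzw_codes(FILE *fp)
-     {
-         long codes = 0;
-         int c, w = -1;                  /* w == -1: no current match */
-
-         dict_init();
-         while ((c = getc(fp)) != EOF) {
-             int k = dict_find(w, (unsigned char)c);
-             if (k >= 0) {
-                 w = k;                  /* extend the current match */
-             } else {
-                 codes++;                /* emit the code for w */
-                 if (dict_size < DICT_MAX) {
-                     /* learn the new string; real V.42bis recovers old
-                      * entries when the dictionary fills, this sketch
-                      * just stops learning */
-                     dict[dict_size].prefix = w;
-                     dict[dict_size].ch = (unsigned char)c;
-                     dict_size++;
-                 }
-                 w = c;
-             }
-         }
-         if (w != -1)
-             codes++;                    /* flush the final match */
-         return codes;
-     }
-
-     int main(int argc, char **argv)
-     {
-         FILE *fp;
-
-         if (argc != 2 || (fp = fopen(argv[1], "rb")) == NULL) {
-             fprintf(stderr, "usage: lzwcount file\n");
-             return 1;
-         }
-         printf("%ld codewords\n", lzw_codes(fp));
-         fclose(fp);
-         return 0;
-     }
-
- Run it over a text file and over the same file after the repeated-byte
- elimination above; my bet is that the two counts (times the codeword
- width) end up close enough that the extra pass wasn't worth the CPU time.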
-
- Bruce C. Wright
-